
    Learning to win by reading manuals in a Monte-Carlo framework

    This paper presents a novel approach for leveraging automatically extracted textual knowledge to improve the performance of control applications such as games. Our ultimate goal is to enrich a stochastic player with high-level guidance expressed in text. Our model jointly learns to identify text that is relevant to a given game state and to follow game strategies guided by the selected text. Our method operates in the Monte-Carlo search framework, and learns both text analysis and game strategies based only on environment feedback. We apply our approach to the complex strategy game Civilization II using the official game manual as the text guide. Our results show that a linguistically informed game-playing agent significantly outperforms its language-unaware counterpart, yielding a 27% absolute improvement and winning over 78% of games when playing against the built-in AI of Civilization II.
    Funding: National Science Foundation (U.S.) (CAREER grant IIS-0448168); National Science Foundation (U.S.) (CAREER grant IIS-0835652); United States Defense Advanced Research Projects Agency (DARPA Machine Reading Program, FA8750-09-C-0172); Microsoft Research (New Faculty Fellowship).
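    As a rough illustration of the approach described above, the sketch below couples Monte-Carlo rollouts with a log-linear model over joint features of the game state, a candidate action, and a manual sentence selected as relevant, updating the model only from final game outcomes. It is a minimal sketch under stated assumptions, not the authors' implementation: ToyGame, MANUAL, the feature templates, and the crude REINFORCE-style update are all illustrative stand-ins for the Civilization II setup in the paper.

```python
import math
import random
from collections import defaultdict

# Minimal sketch (not the authors' system): a Monte-Carlo player whose action
# choice is guided by a log-linear model over joint features of the current
# state, a candidate action, and a manual sentence selected as relevant.
# ToyGame, MANUAL, and the update rule are illustrative assumptions.

MANUAL = [
    "advance your units toward the objective",
    "wait and defend when threatened",
]

class ToyGame:
    """Stand-in environment: reach position 10 within 20 steps to win."""
    ACTIONS = ["advance", "wait"]

    def __init__(self):
        self.pos, self.steps = 0, 0

    def step(self, action):
        if action == "advance":
            self.pos += 1
        self.steps += 1

    def done(self):
        return self.steps >= 20

    def outcome(self):
        return 1.0 if self.pos >= 10 else 0.0

def features(state, action, sentence):
    """Sparse joint features of (state, action, words of the selected sentence)."""
    feats = {("act", action): 1.0, ("pos", state.pos // 5, action): 1.0}
    for word in sentence.split():
        feats[("word", word, action)] = 1.0
    return feats

def choose(weights, state):
    """Softmax over joint (action, sentence) pairs: text selection and action
    selection are made together."""
    pairs = [(a, s) for a in ToyGame.ACTIONS for s in MANUAL]
    scores = [sum(weights[k] * v for k, v in features(state, a, s).items())
              for a, s in pairs]
    m = max(scores)
    probs = [math.exp(x - m) for x in scores]
    r, acc = random.random() * sum(probs), 0.0
    for pair, p in zip(pairs, probs):
        acc += p
        if r <= acc:
            return pair
    return pairs[-1]

def learn_from_rollouts(n_rollouts=2000, lr=0.1, baseline=0.5):
    """Run rollouts and reinforce features of the choices made, using only the
    final game outcome as feedback (a crude REINFORCE-style simplification)."""
    weights = defaultdict(float)
    for _ in range(n_rollouts):
        game, trace = ToyGame(), []
        while not game.done():
            action, sentence = choose(weights, game)
            trace.append(features(game, action, sentence))
            game.step(action)
        for feats in trace:
            for k, v in feats.items():
                weights[k] += lr * (game.outcome() - baseline) * v
    return weights

if __name__ == "__main__":
    w = learn_from_rollouts()
    print("advance:", round(w[("act", "advance")], 2),
          "wait:", round(w[("act", "wait")], 2))
```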

    Non-Linear Monte-Carlo Search in Civilization II

    This paper presents a new Monte-Carlo search algorithm for very large sequential decision-making problems. Our approach builds on the recent success of Monte-Carlo tree search algorithms, which estimate the value of states and actions from the mean outcome of random simulations. Instead of using a search tree, we apply non-linear regression, online, to estimate a state-action value function from the outcomes of random simulations. This value function generalizes between related states and actions, and can therefore provide more accurate evaluations after fewer simulations. We apply our Monte-Carlo search algorithm to the game of Civilization II, a challenging multi-agent strategy game with an enormous state space and around 10^21 joint actions. We approximate the value function by a neural network, augmented by linguistic knowledge that is extracted automatically from the official game manual. We show that this non-linear value function is significantly more efficient than a linear value function. Our non-linear Monte-Carlo search wins 80% of games against the handcrafted, built-in AI for Civilization II.
    Funding: National Science Foundation (U.S.) (CAREER grant IIS-0448168); National Science Foundation (U.S.) (grant IIS-0835652); United States Defense Advanced Research Projects Agency (DARPA Machine Reading Program, FA8750-09-C-0172); Microsoft Research (New Faculty Fellowship).
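    The sketch below illustrates the core idea in miniature: instead of building a search tree, fit a small neural network state-action value function online to the outcomes of random simulations and act greedily with respect to it. It is an assumption-laden toy, not the paper's system; TinyQNet, ToyEnv, featurize, and all hyperparameters are invented for illustration, and the paper's network is additionally augmented with features from the game manual.

```python
import numpy as np

# Minimal sketch under stated assumptions, not the paper's system: instead of
# a search tree, fit a small neural network Q(s, a) online to the outcomes of
# random rollouts, then pick the action with the highest estimated value.

rng = np.random.default_rng(0)

class TinyQNet:
    """One-hidden-layer regressor trained by SGD on rollout outcomes."""

    def __init__(self, n_inputs, n_hidden=16, lr=0.05):
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_inputs))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, 0.1, n_hidden)
        self.b2 = 0.0
        self.lr = lr

    def forward(self, x):
        h = np.tanh(self.W1 @ x + self.b1)
        return h, h @ self.w2 + self.b2

    def predict(self, x):
        return self.forward(x)[1]

    def update(self, x, target):
        h, q = self.forward(x)
        err = q - target                        # gradient of 0.5 * (q - target)^2
        dh = err * self.w2 * (1.0 - h ** 2)     # backprop through tanh
        self.w2 -= self.lr * err * h
        self.b2 -= self.lr * err
        self.W1 -= self.lr * np.outer(dh, x)
        self.b1 -= self.lr * dh

class ToyEnv:
    """Tiny stand-in environment: random walk on a line, win by ending high."""
    n_actions = 2                               # 0 = step left, 1 = step right

    def __init__(self, pos=0):
        self.pos = pos

    def simulator_from(self, state):
        return ToyEnv(int(state[0]))

    def rollout(self, first_action, horizon=10):
        pos = self.pos + (1 if first_action == 1 else -1)
        for _ in range(horizon - 1):
            pos += 1 if rng.random() < 0.5 else -1
        return 1.0 if pos >= 3 else 0.0

def featurize(state, action, n_actions):
    """Illustrative features: state vector concatenated with a one-hot action."""
    onehot = np.zeros(n_actions)
    onehot[action] = 1.0
    return np.concatenate([state, onehot])

def monte_carlo_search(env, net, state, n_rollouts=500):
    """Estimate Q online from random simulations, then act greedily."""
    for _ in range(n_rollouts):
        sim = env.simulator_from(state)
        action = rng.integers(env.n_actions)    # random simulation policy
        x = featurize(state, action, env.n_actions)
        net.update(x, sim.rollout(first_action=action))
    qs = [net.predict(featurize(state, a, env.n_actions))
          for a in range(env.n_actions)]
    return int(np.argmax(qs))

if __name__ == "__main__":
    env = ToyEnv()
    net = TinyQNet(n_inputs=1 + env.n_actions)
    best = monte_carlo_search(env, net, np.array([0.0]))
    print("chosen action:", best)               # usually 1 (step right)
```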

    Learning Document-Level Semantic Properties from Free-Text Annotations

    This paper presents a new method for inferring the semantic properties of documents by leveraging free-text keyphrase annotations. Such annotations are becoming increasingly abundant due to the recent dramatic growth in semi-structured, user-generated online content. One especially relevant domain is product reviews, which are often annotated by their authors with pros/cons keyphrases such as "a real bargain" or "good value." These annotations are representative of the underlying semantic properties; however, unlike expert annotations, they are noisy: lay authors may use different labels to denote the same property, and some labels may be missing. To learn using such noisy annotations, we find a hidden paraphrase structure which clusters the keyphrases. The paraphrase structure is linked with a latent topic model of the review texts, enabling the system to predict the properties of unannotated documents and to effectively aggregate the semantic properties of multiple reviews. Our approach is implemented as a hierarchical Bayesian model with joint inference. We find that joint inference increases the robustness of the keyphrase clustering and encourages the latent topics to correlate with semantically meaningful properties. Multiple evaluations demonstrate that our model substantially outperforms alternative approaches for summarizing single and multiple documents into a set of semantically salient keyphrases.
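    A minimal two-stage sketch of the underlying intuition follows: cluster noisy free-text keyphrases into paraphrase groups by word overlap, count which review words co-occur with each group, and use those counts to predict properties of unannotated reviews. This is a deliberately simplified approximation with invented example data (jaccard, cluster_keyphrases, the threshold, and the toy reviews are all assumptions); the paper instead performs joint inference in a hierarchical Bayesian model.

```python
from collections import defaultdict

# Minimal sketch of the underlying idea, not the paper's hierarchical Bayesian
# model with joint inference: (1) cluster noisy free-text keyphrases into
# paraphrase groups by word overlap, (2) count which review words co-occur
# with each group, (3) predict groups for unannotated reviews.
# Data, threshold, and helper names are illustrative assumptions.

def jaccard(a, b):
    a, b = set(a.split()), set(b.split())
    return len(a & b) / len(a | b)

def cluster_keyphrases(phrases, threshold=0.3):
    """Greedy single-link clustering of keyphrases into paraphrase groups."""
    clusters = []
    for p in phrases:
        for c in clusters:
            if any(jaccard(p, q) >= threshold for q in c):
                c.append(p)
                break
        else:
            clusters.append([p])
    return clusters

def train(annotated_reviews, clusters):
    """Count which review words co-occur with each paraphrase cluster."""
    phrase_to_cluster = {p: i for i, c in enumerate(clusters) for p in c}
    counts = defaultdict(lambda: defaultdict(int))
    for text, keyphrases in annotated_reviews:
        for kp in keyphrases:
            for word in text.lower().split():
                counts[phrase_to_cluster[kp]][word] += 1
    return counts

def predict(text, counts):
    """Score each cluster by how often its co-occurring words appear here."""
    words = text.lower().split()
    scores = {ci: sum(wc.get(w, 0) for w in words) for ci, wc in counts.items()}
    return max(scores, key=scores.get)

if __name__ == "__main__":
    reviews = [
        ("the food was cheap and tasty", ["good value"]),
        ("great prices for a big portion", ["great value"]),
        ("the waiter was rude to us", ["bad service"]),
        ("staff ignored us all night", ["poor service"]),
    ]
    clusters = cluster_keyphrases([kp for _, kps in reviews for kp in kps])
    model = train(reviews, clusters)
    best = predict("very cheap meal with great prices", model)
    print("paraphrase clusters:", clusters)
    print("predicted property:", clusters[best])
```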

    Learning High-Level Planning from Text

    Comprehending action preconditions and effects is an essential step in modeling the dynamics of the world. In this paper, we express the semantics of precondition relations extracted from text in terms of planning operations. The challenge of modeling this connection is to ground language at the level of relations. This type of grounding enables us to create high-level plans based on language abstractions. Our model jointly learns to predict precondition relations from text and to perform high-level planning guided by those relations. We implement this idea in the reinforcement learning framework using feedback automatically obtained from plan execution attempts. When applied to a complex virtual world and text describing that world, our relation extraction technique performs on par with a supervised baseline, yielding an F-measure of 66% compared to the baseline's 65%. Additionally, we show that a high-level planner utilizing these extracted relations significantly outperforms a strong, text-unaware baseline, successfully completing 80% of planning tasks as compared to 69% for the baseline.
    Funding: National Science Foundation (U.S.) (CAREER Grant IIS-0448168); United States Defense Advanced Research Projects Agency, Machine Reading Program (FA8750-09-C-0172, PO#4910018860); Battelle Memorial Institute (PO#300662).
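    The toy sketch below captures the two pieces in simplified form: pattern-based extraction of precondition relations from text, and a high-level planner that satisfies preconditions before the goal. The sentences, regular-expression patterns, and helper names are illustrative assumptions; the paper learns the extractor jointly with the planner from plan-execution feedback rather than relying on hand-written patterns.

```python
import re

# Minimal sketch of the idea, not the authors' jointly trained model: extract
# candidate precondition relations from text with simple patterns, then build
# a high-level plan that satisfies preconditions before the goal. Sentences
# and patterns below are illustrative assumptions.

PATTERNS = [
    re.compile(r"you need (?:a |an )?(\w+) to (?:make |get )?(?:a |an )?(\w+)"),
    re.compile(r"(?:a |an )?(\w+) is required for (?:a |an )?(\w+)"),
]

def extract_preconditions(sentences):
    """Return a dict mapping each target item to the set of its preconditions."""
    pre = {}
    for s in sentences:
        for pat in PATTERNS:
            for x, y in pat.findall(s.lower()):
                pre.setdefault(y, set()).add(x)
    return pre

def plan(goal, preconditions, have):
    """Depth-first high-level plan: achieve preconditions, then the goal."""
    steps, seen = [], set()

    def achieve(item):
        if item in have or item in seen:
            return
        seen.add(item)                      # guards against cyclic relations
        for dep in preconditions.get(item, ()):
            achieve(dep)
        steps.append(f"obtain {item}")

    achieve(goal)
    return steps

if __name__ == "__main__":
    manual = [
        "You need wood to make a pickaxe.",
        "A pickaxe is required for stone.",
        "You need stone to make a furnace.",
    ]
    pre = extract_preconditions(manual)
    print(pre)
    print(plan("furnace", pre, have={"wood"}))
```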

    Good grief, I can speak it! Preliminary experiments in audio restaurant reviews

    In this paper, we introduce an envisioned speech application that allows users to enter restaurant reviews orally via their mobile device and, at a later time, update a shared and growing database of consumer-provided information about restaurants. In the intervening period, a speech-recognition and NLP-based system analyzes their audio recording both to extract key descriptive phrases and to compute sentiment ratings from the evidence provided in the audio clip. We report here on our preliminary work toward this goal. Our experiments demonstrate that multi-aspect sentiment ranking works surprisingly well on speech output, even in the presence of recognition errors. We also present initial experiments on integrated sentence boundary detection and key phrase extraction from recognition output.
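    As a rough sketch of what multi-aspect sentiment scoring over recognition output might look like, the example below assigns each sentiment word in a transcript to the nearest aspect cue and converts the tallies to 1-5 ratings. The aspect and polarity word lists, the hard-coded hypothesis string, and the scoring scale are illustrative assumptions, not the system described in the paper.

```python
# Rough sketch of multi-aspect sentiment scoring over a (possibly errorful)
# speech-recognition transcript; not the paper's system. The aspect cues,
# polarity lexicons, and the hard-coded hypothesis string are illustrative
# assumptions -- a real pipeline would consume ASR output instead.

ASPECT_WORDS = {
    "food": {"food", "meal", "dish", "pizza", "taste"},
    "service": {"service", "waiter", "staff", "server"},
    "atmosphere": {"atmosphere", "decor", "music", "ambiance"},
}
POSITIVE = {"great", "good", "delicious", "friendly", "lovely"}
NEGATIVE = {"bad", "slow", "rude", "cold", "bland"}

def aspect_ratings(transcript):
    """Assign each sentiment word to the nearest aspect cue; map tallies to 1-5."""
    words = transcript.lower().split()
    cues = [(i, aspect) for i, w in enumerate(words)
            for aspect, cue_words in ASPECT_WORDS.items() if w in cue_words]
    totals = {aspect: 0 for aspect in ASPECT_WORDS}
    for i, w in enumerate(words):
        if cues and (w in POSITIVE or w in NEGATIVE):
            _, aspect = min(cues, key=lambda c: abs(c[0] - i))
            totals[aspect] += 1 if w in POSITIVE else -1
    return {aspect: max(1, min(5, 3 + t)) for aspect, t in totals.items()}

if __name__ == "__main__":
    # Pretend this string came back from a speech recognizer, errors and all.
    hypothesis = "the pizza was delicious but the waiter was slow and rude"
    print(aspect_ratings(hypothesis))
```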

    WikiDo

    Not formally published.
    The Internet has allowed collaboration on an unprecedented scale. Wikipedia, Luis Von Ahn’s ESP game, and reCAPTCHA have proven that tasks typically performed by expensive in-house or outsourced teams can instead be delegated to the mass of Internet computer users. These success stories show the opportunity for crowdsourcing other tasks, such as allowing computer users to help each other answer questions like “How do I make my computer do X?”. Such a system would reduce IT cost, user frustration, and machine downtime. The current approach to crowdsourcing IT tasks, however, only allows users to collaborate on generating text. Anyone who has searched help wikis and user forums hoping to solve a computer problem knows how ineffective and frustrating the process can be. Text is ambiguous and often incomplete, particularly when written by non-experts. This paper presents WikiDo, a system that enables the mass of non-expert users to help each other answer how-to computer questions by actually performing the task rather than documenting its solution.
    Funding: National Science Foundation (U.S.) (grant IIS-0835652).

    Content Modeling Using Latent Permutations

    We present a novel Bayesian topic model for learning discourse-level document structure. Our model leverages insights from discourse theory to constrain latent topic assignments in a way that reflects the underlying organization of document topics. We propose a global model in which both topic selection and ordering are biased to be similar across a collection of related documents. We show that this space of orderings can be effectively represented using a distribution over permutations called the Generalized Mallows Model. We apply our method to three complementary discourse-level tasks: cross-document alignment, document segmentation, and information ordering. Our experiments show that incorporating our permutation-based model in these applications yields substantial improvements in performance over previously proposed methods.
    Funding: Microsoft Faculty Fellowship; Nokia; Quanta; United States Office of Naval Research; National Science Foundation (CAREER grant IIS-0448168, grant IIS-0712793, and a Graduate Fellowship).
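    The sketch below illustrates the Generalized Mallows Model component in isolation: a topic ordering is encoded as a vector of inversion counts, each drawn independently with probability proportional to exp(-rho_j * v_j), so small dispersion values let a topic drift from the canonical order while large values pin it in place. The encoding, dispersion values, and helper names are illustrative assumptions; the paper embeds this distribution in a larger Bayesian model with collapsed inference.

```python
import math
import random

# Minimal sketch of the Generalized Mallows Model over permutations (an
# illustration, not the paper's inference code). An ordering over n topics is
# encoded by inversion counts v_1 .. v_{n-1}, where v_j is the number of
# higher-indexed topics placed before topic j, and each v_j is drawn
# independently with P(v_j = r) proportional to exp(-rho_j * r).
# The dispersion values rho_j below are illustrative assumptions.

def sample_inversions(rhos, n):
    """Draw v_j in {0, ..., n - j} with P(v_j = r) proportional to exp(-rho_j * r)."""
    v = []
    for j, rho in enumerate(rhos, start=1):          # j = 1 .. n - 1
        weights = [math.exp(-rho * r) for r in range(n - j + 1)]
        target = random.random() * sum(weights)
        acc, r = 0.0, 0
        for r, w in enumerate(weights):
            acc += w
            if target <= acc:
                break
        v.append(r)
    return v

def inversions_to_permutation(v, n):
    """Insert topics n, n-1, ..., 1 so that v_j higher topics precede topic j."""
    perm = [n]
    for j in range(n - 1, 0, -1):
        perm.insert(v[j - 1], j)
    return perm

if __name__ == "__main__":
    n = 5                          # five latent topics
    rhos = [2.0] * (n - 1)         # fairly strong pull toward canonical order
    for _ in range(3):
        v = sample_inversions(rhos, n)
        print("inversions:", v, "-> ordering:", inversions_to_permutation(v, n))
```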